GENDER BIAS IN THE EVALUATION OF
MALE AND FEMALE POLICE OFFICER PERFORMANCE 



Following an arrest attempt in which a male training officer was shot, the 
female trainee-partner was fired for "cowardice." The authors were 
contacted to assist in the evaluation of the fired female officer's Title VII 
claim that she was the victim of a biased appraisal process. A review of 
the literature on anti-female bias in performance appraisals revealed 
conflicting results, suggesting a number of possible moderators of the 
effect, but providing little empirical basis for a definitive opinion in the 
instant case. Accordingly, the present study was undertaken to evaluate 
the likelihood of anti-female bias in the present set of factual 
circumstances. Specifically, is a female compared to a male trainee likely 
to be more harshly evaluated in the present circumstances? 

Based on investigative reports, depositions, and eye-witness accounts in 
the above litigation, four versions of a written scenario were prepared 
which chronicled the actual arrest sequence involving the two police 
officers and the suspect. The four scenarios were identical except that the 
names and pronouns used to describe the two officers were altered to 
produce the four possible gender pairings. 

A behaviorally-anchored performance rating form was developed which 
required the respondent to evaluate the performance of each of the two 
officers five times by selecting one of seven possible administrative 
actions ranging from termination to a meritorious performance award 
recommendation. The five evaluations consisted of an evaluation of each 
of four phases of the arrest sequence as well as an overall evaluation. 

One of the four scenarios was randomly selected and then sent to the 
police chief in each of the 226 U.S. cities with populations greater than 
eighty thousand. Instructions indicated that the instrument should be 
completed either by the chief or some other senior officer familiar with 
proper police procedures and experienced in the evaluation of police officer 
performance. Respondents were told only that our purpose was to 
explore "human decision processes in the performance evaluation 
context." No indication of the litigation underlying the study was 
provided. 

One hundred fifty-seven (70%) scenario ratings were returned. The 
results of a 2 (field training officer gender) by 2 (trainee gender) ANOVA 
of the field training officer (FTO) overall performance ratings are shown in 
Table 1. A significant interaction was detected. 

Table 1. ANOVA Summary Table for FTO Overall Performance Rating 

                                 Mean               Signif 
Source of Variation      DF      Square      F      of F 

FTO                       1       .370      .258    .612 
TRAINEE                   1      4.811     3.349    .069 
FTO X TRAINEE             1      9.146     6.365    .013 
Residual                153      1.437 

The analysis reveals that while neither the gender of the FTO nor the 
gender of the trainee alone had a significant effect on the FTO rating, the 
particular gender mix of the team, the FTO by Trainee interaction, did 
significantly impact the FTO ratings. The means for each gender 
combination are shown in Table 2. 




Table 2. Mean FTO Performance Rating by Treatment Condition 

                     Male FTO     Female FTO 

Male Trainee           2.51          2.92 
Female Trainee         3.38          2.81 

A post-hoc analysis revealed what is apparent from inspection of the cell 
means. The performance of the FTO is rated significantly higher when 
a male FTO is paired with a female trainee than in any other gender 
combination. 
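
The F ratios in Table 1 can be checked directly, since each is the effect mean square divided by the residual mean square. The following is a minimal Python sketch that uses only the values printed in Table 1; the variable names are illustrative and are not part of the original analysis.

```python
# Mean squares as printed in Table 1 (FTO overall performance rating).
mean_squares = {
    "FTO": 0.370,
    "TRAINEE": 4.811,
    "FTO X TRAINEE": 9.146,
}
ms_residual = 1.437  # residual mean square, 153 degrees of freedom

# Each F ratio is the effect mean square over the residual mean square.
f_ratios = {source: ms / ms_residual for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    print(f"{source:15s} F(1, 153) = {f:.3f}")
```

The recomputed ratios agree with the tabled F values to within rounding of the printed mean squares. The significance levels cannot be recovered this way without an F distribution routine, so they are not recomputed here.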



A parallel two-way ANOVA was performed on the trainee overall 
performance rating. Tables 3 and 4 reveal that the trainee overall 
performance rating was not affected by the gender mix of the arresting 
team. The means displayed in Table 4 are similar across the four 
conditions. 



Table 3. ANOVA Summary Table for Trainee Overall Performance 
Rating 

                                 Mean               Signif 
Source of Variation      DF      Square      F      of F 

FTO                       1       .735      .468    .495 
TRAINEE                   1       .965      .615    .434 
FTO X TRAINEE             1       .003      .002    .963 
Residual                153      1.569 

Table 4. Mean Trainee Performance Rating by Treatment 
Condition 

                     Male FTO     Female FTO 

Male Trainee           3.21          3.33 
Female Trainee         3.35          3.50 

Together, the results displayed in Tables 1 through 4 indicate that the 
gender composition of the arresting team does indeed impact the relative 
performance evaluation of team members. Though the trainee's 
evaluation is not directly affected, the FTO performance evaluation is 
elevated when the FTO is male and the trainee is female. In this 
condition alone, the male FTO's performance is rated significantly higher 
than in any other gender combination. This effect is graphically displayed 
in Figure 1. 



Figure 1. Plot of FTO and Trainee Mean Performance 
Rating by Team Gender Composition 

Light is also shed on the related question of whether the female trainee 
is relatively more harshly evaluated. Recall that we did not find that the 
gender of the trainee directly affects the trainee evaluation. However, 
four paired-samples t-tests confirmed that the trainee's performance is 
regarded as significantly better than the FTO's performance in three of 
the four conditions (p<.01). The one gender combination where this 
relationship does not occur is when a female trainee is paired with a male 
FTO. In this team, the very same male FTO behaviors are evaluated 
more favorably than in the other three conditions. This finding was 
particularly relevant in the present context because the one condition 
where the trainee's performance suffers by contrast with the FTO's 
performance is precisely that gender combination that was involved in the 
arrest incident that led to the female trainee's termination. 
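
The pattern described above can be illustrated from the reported cell means alone. The sketch below is a minimal Python illustration, assuming the cell means in Tables 2 and 4 are listed row by row with the male trainee row first. That reading is an assumption, though it is consistent with the results described in the text. The sketch does not reproduce the paired-samples t-tests, which would require the individual ratings.

```python
# Cell means keyed by (FTO gender, trainee gender).
# Assumption: Tables 2 and 4 list their cells row by row, male trainee first.
fto_means = {
    ("male", "male"): 2.51,
    ("female", "male"): 2.92,
    ("male", "female"): 3.38,
    ("female", "female"): 2.81,
}
trainee_means = {
    ("male", "male"): 3.21,
    ("female", "male"): 3.33,
    ("male", "female"): 3.35,
    ("female", "female"): 3.50,
}

# Trainee-minus-FTO gap in mean overall rating for each condition.
gaps = {cond: round(trainee_means[cond] - fto_means[cond], 2) for cond in fto_means}

for (fto, trainee), gap in gaps.items():
    print(f"{fto} FTO, {trainee} trainee: trainee - FTO = {gap:+.2f}")

# Condition in which the trainee's advantage over the FTO is smallest.
smallest = min(gaps, key=gaps.get)
```

Under this reading, the trainee's mean rating exceeds the FTO's by roughly 0.4 to 0.7 scale points in three conditions, while the gap essentially vanishes when a male FTO is paired with a female trainee.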



An additional analysis focused on the decision to terminate the trainee, 
and whether this decision was made without regard to the trainee's gender. 
A contingency analysis was performed on the dichotomized survey ratings 
of the trainee's overall performance. Consistent with the rating scale 
anchors, trainee performance ratings of "2" or greater were coded as a 
"retain" decision; ratings of "1" were coded as a termination decision. 
We then examined the frequency with which female trainees were 
terminated compared to male trainees. Table 5 displays both the 
obtained frequency and, in parentheses, the frequency one would expect 
given a gender-neutral evaluation process. 

While only 7% of the respondents indicated the trainee should be 
terminated, a disproportionate number of the terminations occurred for 
female trainees. Of 11 trainee termination decisions, nine of these were 
rendered for female trainees. Only two termination recommendations 
were returned for male trainees. 



Table 5. Contingency Analyses of Termination/Retention 
Decisions by Gender of Trainee 

                 FIRED     RETAINED 

FEMALE             9          73 
MALE               2          73 

To examine whether the obtained departure from the expected 
frequencies was sufficiently large to confirm the hypothesis that female 
trainees are more frequently terminated than male trainees, a chi-square 
test of statistical significance was performed. The chi-square test was 
significant (χ² = 4.15, p<.05). Police departments were more inclined to 
discharge female than male trainees for precisely the same behaviors. 
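
The reported chi-square value can be recomputed from the four observed counts in Table 5. The following is a minimal Python sketch using only the standard library. It assumes an uncorrected Pearson chi-square with no continuity correction, which is the variant that matches the reported figure; the variable names are illustrative.

```python
import math

# Observed counts from Table 5: rows are trainee gender,
# columns are (fired, retained).
observed = {"female": (9, 73), "male": (2, 73)}

n = sum(sum(row) for row in observed.values())
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]

chi_square = 0.0
expected = {}
for gender, row in observed.items():
    row_total = sum(row)
    # Expected count under independence: row total * column total / N.
    expected[gender] = tuple(row_total * c / n for c in col_totals)
    chi_square += sum((o - e) ** 2 / e for o, e in zip(row, expected[gender]))

# With 1 degree of freedom, the upper-tail chi-square probability
# equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"N = {n}, chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```

Under a gender-neutral process, the expected number of terminations is between five and six for each gender, compared with the nine and two actually observed.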

Taken together, the data from this study raise serious questions about 
the influence of sex bias in police officer performance evaluations. The 
data show that police departments, perhaps quite unconsciously, do 
permit gender to influence their assessment of the effectiveness of an 
officer's performance. Precisely the same behaviors in the arrest scenario 
were evaluated differently depending on the gender combination of the 
team. The data show that the performance of the trainee is regarded as 
superior to that of the FTO except when the FTO is male and the trainee 
is female. In this latter case, the female trainee is evaluated less 
favorably than the male FTO. 

More disturbing is the finding that, for precisely the same actions, police 
departments are significantly more likely to terminate female trainees 
than male trainees. This predisposition toward disparate treatment raises 
serious questions about the even-handedness female police officers can 
expect in this male-stereotyped job. 

Disquieting as these findings are, a note of caution in generalizing these 
findings is warranted. It is worth remembering that, though the facts 
depicted in the scenario are "real", respondents were nevertheless 
evaluating "paper people" in a research context. We, of course, cannot 
be sure that the same gender-based treatment differential occurs in the 
day-to-day operations of police departments throughout the U.S. 

Given the inconsistent pattern of findings in the literature, the 
demonstration of gender bias in the present study suggests three 
potential moderators worthy of further investigation. Gender bias may 
be more likely a) in situations involving strongly gender-stereotyped 
performance settings (e.g., dangerous situations involving physically 
demanding actions), b) where team performance forces the rater to 
apportion responsibility for outcomes among team members, and c) 
where performance judgments are directly linked to specific 
administrative actions. 